Mac Magazin/MacEasy 21

Text File | 1996-02-28 | 19KB | 427 lines

Notes on the Yorick Interpretive Language
------------------------------------------------------------------------

1. General description
----------------------

Yorick syntax is designed to imitate C syntax. This enables text
editors which understand C syntax, such as GNU Emacs, to work without
modification on Yorick code.

However, Yorick handles variables more like LISP than C. This is
primarily a consequence of the fact that LISP, like Yorick, is designed
as an interpreter. A variable "remembers" its data type, so there is no
need for type declarations in Yorick. They are replaced by the type
conversion and array building functions.

Variable scoping rules in Yorick strongly resemble LISP scoping rules.
Each variable is either local or external with respect to the current
function; an external variable will inherit the value of the most
recent calling function for which it is local.

Finally, Yorick array index and function call syntax resembles FORTRAN.
Like FORTRAN (but unlike C), Yorick places little emphasis on the
storage order of arrays, emphasizing instead the independence of the
several dimensions of a multidimensional array. Yorick function calls
are indistinguishable from indexed arrays -- and not simply
syntactically indistinguishable as in FORTRAN, since a single Yorick
expression of the form "x(i,j)" would result in a function call if x
happened to be a function, but could later represent an indexed array
if the data type of "x" changed from function to array.

Unlike C and FORTRAN, Yorick is an interpreter. What you type will
execute immediately without any further ado. Yorick recoups much of the
penalty in execution speed associated with interpreters by providing a
powerful vector syntax. You need very few loops in Yorick, since an
expression involving an array "x" automatically applies to each element
of "x" to produce an array result.
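
As a small illustration of the vector syntax, here is a sketch that
evaluates a polynomial at five points without an explicit loop. The
variable names and the polynomial are arbitrary, and span is assumed
to be available from std.i.

```yorick
x= span(0.0, 1.0, 5);   /* five equally spaced points from 0 to 1 */
y= 3.0*x^2 + 2.0*x;     /* evaluated for every element of x at once */
print, y;               /* y is a 5 element array, like x */

/* the equivalent explicit loop, which is wordier */
z= array(0.0, 5);
for (i=1 ; i<=5 ; i++) z(i)= 3.0*x(i)^2 + 2.0*x(i);
print, z;
```

Both print statements should show the same five values; the first form
is the one Yorick is designed for.
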
An important side effect is that you need not pass array dimensions as
arguments to procedures; every variable "remembers" its dimensions just
as it "remembers" its data type.

Yorick provides for both ASCII and binary input and output. Most of the
functionality of the ANSI standard C library for ASCII I/O and string
manipulation is available for use by interpreted routines. The binary
I/O requires only that you be able to specify the layout of the file in
the manner of a C struct statement; after you have done this, Yorick
transparently moves data arrays from disk to memory and vice versa.
Like Stewart Brown's PDB library, Yorick supports automatic binary data
type conversions to allow files written on one type of machine to be
read on another, or to allow a program on one machine to write a file
containing numbers in the format native to a different machine.

Finally, Yorick is extensible. Special versions of Yorick may be
created which include compiled code to do special purpose number
crunching. The compiled code takes its inputs from arrays generated by
the Yorick interpreter, and passes its results back to arrays which
will be visible to interpreted code.

2. Language
-----------

A Yorick program consists of either a single executable statement --
the "*main*" program -- or a function definition, or a data structure
definition.

2A. Executable statements

An executable statement is typically an assignment statement:

   x= 3;
   temp_variable = array(int, 3,4);
   quad_sol= (-b + sqrt(b^2 - 4*a*c))/(2*a)

The trailing semicolon is optional -- if no semicolon is present, and
an expression would be complete and legal at the end of a line, Yorick
supplies a semicolon. If the final character of a line is a backslash
(\), Yorick will not automatically supply a semicolon. The intent is to
allow you to omit typing trailing semicolons on lines entered from the
keyboard.
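
The backslash rule can be sketched as follows. The values of a, b, and
c are arbitrary and chosen only for illustration.

```yorick
a= 1.0;  b= 0.0;  c= -1.0;

/* Without the backslash, the first line below would be a complete
   statement, Yorick would supply a semicolon after it, and the
   second line would then be a syntax error. */
quad_sol= (-b + sqrt(b^2 - 4*a*c)) \
          /(2*a);
print, quad_sol;
```
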
I strongly recommend you include the terminating semicolons explicitly
in Yorick source code files, to enhance the resemblance with C.

A second common executable statement is a subroutine call:

   read, a, b, c;
   print, (-b+sqrt(b^2-4*a*c))/(2*a), (-b-sqrt(b^2-4*a*c))/(2*a);
   do_something

Notice that if the subroutine has arguments, its name is separated from
them by a comma. This rule is necessitated by a third form of
executable statement in Yorick: the implied print statement. Any
statement which consists solely of an expression will print the value
of that expression. Hence, instead of the above explicit call to the
print function, you could have written:

   (-b+sqrt(b^2-4*a*c))/(2*a); (-b-sqrt(b^2-4*a*c))/(2*a)

(The only difference is that a print with two arguments will print the
results on the same line, while each individual expression which is
printed via an implied print statement appears on a separate line.)
Note also that explicit use of the semicolon delimiter allows you to
stack several statements on a single input line.

The following statement is ambiguous:

   x;

If x is a function, this calls the function x with no arguments, then
discards any result. If x is not a function, it is printed as an
implied print statement. Yorick resolves this ambiguity at run time,
just as it resolves the ambiguity between a function call x(i,j) and an
indexed array x(i,j) at run time.

The remaining executable statements in Yorick are control structures.
Yorick supports C-style if/else conditionals and while/do/for loops.
Yorick does not support the C switch statement. (The only way I could
think of to implement switch would have been equivalent to an if/else
chain.) Yorick provides the C break and continue statements to break
out of and abort a pass through a loop, respectively. Yorick supports
the infamous C goto statement, which, as in C, should be used only to
break out of deeply nested loops, and possibly to branch to error
processing code.
In a word, use goto sparingly at most. The syntax for the conditional
and loop statements is:

   if (condition_expression) if_statement

   if (condition_expression) if_statement
   else else_statement

   while (condition_expression) body

   do body while (condition_expression)

   for (init_expression ; condition_expression ; inc_expression) body

The "if_statement", "else_statement", or "body" may be either a single
executable statement, or a compound statement. A compound statement
consists of several single statements enclosed in curly braces, as in:

   for (i=1 ; i<=n ; i++) {
     if (get_abc_data(i)) {
       print, "data at point"+swrite(i)+" unavailable";
       continue;  /* abort this pass of the enclosing for loop */
     }
     root1= root2= 0.0;
     if (!a) {
       if (!b) break;  /* exit the enclosing for loop now */
       root1= root2= -c/b;
     } else if (b^2 > 4*a*c) {
       root1= (-b+sqrt(b^2-4*a*c))/(2*a);
       root2= (-b-sqrt(b^2-4*a*c))/(2*a);
     } else {
       print, "negative discriminant at point "+print(i)(0);
       break;  /* exit the enclosing for loop now */
     }
     print, root1, root2;
   }

In addition to the use of compound statements, the for loop, and the
if/else constructs, note these features of the above code fragment:

(1) The = operator is a binary operator like + or *, with the side
effect of redefining the left hand operand. Its value is the value
which was assigned, so that

   root1= root2= 0.0;

first redefines root2 to be 0.0. This operation has a value of 0.0, and
root1 is then redefined to this value.

(2) The + operator concatenates strings.

(3) The swrite function converts array data into ASCII. If print is
invoked as a function, it will also return a string array instead of
printing the result at the terminal.

(4) Yorick comments begin with /* and end with */, like C comments.
Yorick also recognizes comments introduced with // and ending with
newline (C++ style), or introduced with # and ending with newline (Unix
shell style).
Yorick will also ignore all lines between a line consisting of "#if 0"
and a matching line "#endif", but no other C preprocessor directives,
such as "#define" or "#ifdef", are recognized. (The philosophy of
Yorick is that they should be unnecessary.)

2B. Function definitions

A group of Yorick statements may be collected into a Yorick function,
which can then be invoked either as a subroutine or as a function. All
Yorick functions return some value, although a special value nil, [],
is available to represent lack of data. The return value is discarded
if the function is invoked as a subroutine.

Because of the LISP-like variable scoping, Yorick functions can and
should be used as macros: any variable not defined locally within the
function will be retrieved from the context in which the function was
called. Yorick functions do have local variables, and consequently can
be used in the style of C functions as well.

A Yorick function definition looks like this:

   func quad_solve(a, b, c)
   /* return the roots x of the quadratic equation
      a*x^2 + b*x + c = 0 */
   {
     discrim= b^2 - 4*a*c;
     q= -0.5*(b+sign(b)*sqrt(discrim));
     return [q/a, c/q];
   }

The square bracket operator [...] builds an array by concatenating its
arguments. In this case, for example, quad_solve(1,0,-1) prints
"[-1,1]", which are the two solutions of x^2-1=0. Or, we could save the
result in the variable x with:

   x= quad_solve(1,0,-1)

Then x(1) would be -1 and x(2) would be 1. (Yorick array indices are
1-origin by default -- more on this later.) Notice that Yorick silently
converts the data types of the inputs (1,0,-1) from integer to floating
point.

The real power of the Yorick interpreter, however, comes from the fact
that the input arguments to quad_solve can themselves be arrays. All of
the operators =, ^, +, -, *, /, and even [...] will be applied to every
element of the input arrays.

Yorick's array operations rely on the notion of conformability between
operands in binary operations.
Conformability in Yorick follows a simple, universal rule:

   Two Yorick arrays are conformable if the lengths of their
   corresponding dimensions match exactly, or, failing an exact match,
   if one or the other dimension has length 1. If one array has more
   dimensions than the other, the "missing" dimensions of the shorter
   array are treated as dimensions of length 1.

A binary operation between two conformable arrays returns a result
which has the longer of the dimensions of its operands. (Note that
adding a 1-by-15 array to a 20-by-1 array produces a 20-by-15 result by
this rule.) The result is obtained by performing the binary operation
on the corresponding elements of the left and right operands -- the
single index of a 1-length dimension combining with every index value
in turn of the corresponding index of the other operand. Note that
scalar multiplication is a special case of this general rule.

Hence,

   quad_solve(1,0,-[1,4,9,16])

prints "[[-1,-2,-3,-4],[1,2,3,4]]" -- a two dimensional result, while

   quad_solve([1,4](-,),0,-[1,4,9,16])

prints "[[[-1,-2,-3,-4],[-0.5,-1,-1.5,-2]],[[1,2,3,4],[0.5,1,1.5,2]]]"
-- a three dimensional result. The indices represent, in order, (1) the
solutions for c= -1, -4, -9, and -16, (2) the solutions for a= 1 and 4,
and (3) the two roots of the equation.

The special array index "-", used in the above example "[1,4](-,)",
inserts a 1-length dimension into an array. Thus, "[1,4]" is a 1D array
with length 2, and "[1,4](-,)" is a 1-by-2 2D array. Without the
"(-,)", the call to quad_solve will cause a run time error:

   quad_solve([1,4],0,-[1,4,9,16])
   ERROR (quad_solve) operands not conformable in binary *

since the operation a*c in the first line of quad_solve would be
[1,4] * (-[1,4,9,16]), and the length 2 vector is not conformable with
the length 4 vector. However, [1,4](-,) * (-[1,4,9,16]) is a legal
operation, producing the 2D result [[-1,-4,-9,-16],[-4,-16,-36,-64]].
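
The rule can be checked interactively with dimsof, which returns the
number of dimensions followed by the length of each dimension. The
following is only a sketch; the variable names are arbitrary.

```yorick
a= [1,4](-,);          /* 1-by-2, via the "-" pseudo-index */
c= -[1,4,9,16];        /* 1D, length 4 */
print, dimsof(a);      /* [2,1,2]: two dimensions, lengths 1 and 2 */
print, dimsof(c);      /* [1,4]:   one dimension, length 4 */
print, dimsof(a*c);    /* [2,4,2]: the longer of each pair */

u= array(0.0, 1, 15);  /* 1-by-15 */
v= array(0.0, 20, 1);  /* 20-by-1 */
print, dimsof(u+v);    /* [2,20,15] */
```

Note that the missing second dimension of c is treated as length 1, so
it combines with the length 2 second dimension of a.
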
The combination of the Yorick conformability rule and the "-" array
pseudo-index is very convenient for dealing with the multidimensional
arrays common in physical modeling.

Yorick also supports functions with a variable number of arguments, and
functions with keyword arguments. Ordinary positional parameters and
keyword parameters may be used to output results; that is, if these
parameters were supplied as a simple variable reference, then that
variable will be redefined to the local value of the function parameter
when the function returns. Thus, function parameters in Yorick have the
"look and feel" of FORTRAN call-by-reference function parameters. (In
reality, of course, every Yorick variable is a reference to a reference
several levels deep -- Yorick must keep track of a lot of descriptive
information in addition to the mere value of a variable.)

2C. Structure definitions

A Yorick structure definition resembles a C structure definition,
except that dimension lists are Yorick/FORTRAN style, and complicated C
data types are not allowed. Roughly speaking, any C data type which
requires parentheses (or the use of a typedef which requires
parentheses) will not be legal in Yorick.

The basic data types recognized by Yorick are:

   char  short  int  long  float  double  complex  string  pointer

The first six have identical meaning with the data type of the same
name in C (except that Yorick's "char" may mean C's "unsigned char").
The last three correspond to the following C typedefs:

   typedef struct { double re, im; } complex;  /* in C, not Yorick */
   typedef char *string;                       /* in C, not Yorick */
   typedef void *pointer;                      /* in C, not Yorick */

Yorick's strings are either NULL (0) pointers, or 0-terminated strings
of characters, normally used to represent text. Yorick's pointer data
type may either be NULL (0), or point to a Yorick array (of any data
type/structure) -- Yorick can NOT handle a fully general pointer to
anywhere in memory in the spirit of a C pointer.
Nevertheless, the VALUE of a Yorick pointer corresponds exactly to the
VALUE of a pointer (of the appropriate type) referencing the same data
in C. Hence, a Yorick pointer may be passed into a compiled C routine
either directly, or as a member of a larger data structure without any
behind-the-scenes tinkering. Unlike C, however, Yorick knows not only
the type of the object referenced by a pointer, but also its
dimensionality. Hence, when you dereference a Yorick pointer, you get
an ordinary Yorick array, complete with its dimension information, not
just a scalar. Yorick also keeps track of the number of references to
its arrays, so memory is automatically freed when the last Yorick
pointer referencing it disappears.

Yorick arrays may also have compound types consisting of named members
which are arrays of the basic types, or arrays of previously defined
compound types.

   struct Mesh {
     long imax, jmax;
     double *x, *y;
     int *region;
   }

   x= Mesh(imax=31, jmax=11, x= &xarray, y= &yarray, region= &rarray);

2D. Comments

Yorick comments begin with /* and end with */, as in C. A comment may
span any number of lines or be only a small part of a single line. As
far as the parser is concerned, a comment is equivalent to a single
whitespace.

Since this style of comment does not nest, Yorick also recognizes C++
style comments, introduced by // and terminated by a newline, and Unix
shell style comments, introduced by # and terminated by a newline. Use
one of these forms to comment out blocks of lines.

2E. Including files containing Yorick source code

To have Yorick switch its input stream to an alternate file, foo.i,
type the following line:

   #include "foo.i"

Such an include directive may also appear as a line in foo.i itself.
Unlike C, Yorick #include directives must be outside of any func,
struct, or other control structures. Like C, Yorick accepts <foo.i> as
well as "foo.i". Nothing else may be "stacked" on an #include line; it
is NOT a Yorick statement.
The ".i" suffix for Yorick include files is simply a convention; Yorick
does not care about the name of the include file. However, I urge you
to stick to this convention; otherwise you won't be able to tell which
files in a directory are Yorick source, and other programs such as text
editors are unlikely to recognize your Yorick source files.

If you must include files dynamically at runtime, Yorick also provides
two functions -- include and require -- which parse and execute input
files containing Yorick source code at runtime.

To locate the file "foo.i" (assuming that "foo.i" is a relative
pathname -- if it is an absolute pathname there is no question), Yorick
follows a standard "include path":

   (1) Is foo.i in the current working directory?
   (2) Is foo.i in the ~/Yorick directory?
   (3) Is foo.i in the Y_SITE/include directory?
   (4) Is foo.i in the Y_SITE/contrib directory?

(Here, Y_SITE is a site-wide Yorick home directory, set when Yorick is
compiled.)

The std.i file in Y_SITE is automatically included at startup. You can
also place a custom.i file in your ~/Yorick directory, which will be
automatically included at startup. Be sure to copy the default
Y_SITE/include/custom.i into your ~/Yorick directory and read the
instructions therein if you want your own custom.i.

3. Getting help
---------------

The basic Yorick functions are defined in the std.i file. These include
things like dimsof, typeof, structof, and the like, which you can use
to find out about variables. Also, the print function prints useful
information about functions, structure definitions, I/O streams, and
other non-array objects. Finally, there is a function "help", which
will describe itself if you type "help".

4. How it works
---------------

You can use the disassemble function to find out exactly how Yorick
performs a function. Yorick parses its input into a series of virtual
machine instructions.
There is a program counter (the first column) and a stack pointer
(depth indicated in the second column). Runtime error messages include
the value of the program counter when Yorick blew up.

   > func sech(x) { y= tanh(x); return sqrt(1.0-y*y); }
   > disassemble, sech
   func sech(x)
      3 sp+>1 PushVariable(tanh)
      5 sp+>2 PushReference(x)
      7 sp->1 Eval(1)
      9 sp0>1 Define(y)
     11 sp->0 DropTop
     12 sp+>1 PushVariable(sqrt)
     14 sp+>2 PushDouble(1)
     16 sp+>3 PushVariable(y)
     18 sp+>4 PushVariable(y)
     20 sp->3 Multiply
     21 sp->2 Subtract
     22 sp->1 Eval(1)
     24 sp->0 Return
     25 sp==0 Halt-Virtual-Machine
   >
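
The same instruction sequence serves for scalar and array arguments
alike. As a rough check, the function can be applied to a whole array
and compared with 1/cosh. This is only a sketch: the name sech_demo is
hypothetical, and span, cosh, abs, and max are assumed to be available
from std.i.

```yorick
/* same body as the sech example above, under a hypothetical name */
func sech_demo(x) { y= tanh(x); return sqrt(1.0-y*y); }

x= span(-2.0, 2.0, 9);  /* nine points from -2 to 2 */

/* largest deviation from 1/cosh(x); should be near zero */
print, max(abs(sech_demo(x) - 1.0/cosh(x)));
```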